forked from open-mpi/ompi
tau memory_instrumentation (v4.0.x) #7
Open: naughtont3 wants to merge 147 commits into v4.0.x from tjn-tau-meminstr-v4-0-x
If UCX is available, then pml/ucx will be used instead of pml/ob1 + btl/openib, so there is no need to warn about btl/openib not supporting Infiniband. Signed-off-by: Gilles Gouaillardet <[email protected]> (cherry picked from commit open-mpi/ompi@0a2ce58)
…led. Fixes an issue introduced in open-mpi/ompi@0a2ce58 This is a one-off commit for the v4.0.x branch since btl/openib has been removed from master. Refs. open-mpi#6137 Signed-off-by: Gilles Gouaillardet <[email protected]>
Many thanks to Sergey Oblomov for reporting this issue and the countless traces provided when troubleshooting it. This is a one-off commit for the v4.0.x branch since btl/openib has been removed from master. Refs. open-mpi#6137 Signed-off-by: Gilles Gouaillardet <[email protected]>
When ob1 uses btl_put, the handle of the locally registered memory is sent with a PUT control message. In the current master code the handle sent is necessarily the handle in the frag, but if the handle has been successfully registered in the request, the frag structure does not have a valid handle and all fragments use the request's handle. I suggest checking whether the handle in the fragment is valid and, if not, sending the handle from the request. Signed-off-by: Brelle Emmanuel <[email protected]> (cherry picked from commit e630046)
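The fallback described in this commit message can be pictured with a minimal sketch. The struct and function names below (`reg_handle_t`, `rdma_frag_t`, `recv_request_t`, `select_put_handle`) are hypothetical stand-ins for illustration, not the actual ob1 types or code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the ob1 structures; NOT the real Open MPI types. */
typedef struct { int key; } reg_handle_t;

typedef struct {
    reg_handle_t *local_handle;   /* NULL when the fragment holds no registration */
} rdma_frag_t;

typedef struct {
    reg_handle_t *local_handle;   /* set when the registration lives in the request */
} recv_request_t;

/* Choose the handle to place in the PUT control message:
 * prefer the fragment's own handle, otherwise fall back to the request's. */
static reg_handle_t *select_put_handle(const rdma_frag_t *frag,
                                       const recv_request_t *req)
{
    if (frag != NULL && frag->local_handle != NULL) {
        return frag->local_handle;
    }
    return (req != NULL) ? req->local_handle : NULL;
}
```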
The rdma_frag attached to the send request was not correctly released upon request completion, leaking until MPI_Finalize. A quick solution would have been to add RDMA_FRAG_RETURN at different locations on the send request completion, but it would have unnecessarily made the sendreq completion path more complex. Instead, I added the length to the RDMA fragment so that it can be completed during the remote ack. Be more explicit on the comment. The rdma_frag can only be freed once when the peer forced a protocol change (from RDMA GET to send/recv). Otherwise the fragment will be returned once all data pertaining to it has been transferred. NOTE: Had to add a typedef for "opal_atomic_size_t" from master into opal/threads/thread_usage.h into this cherry pick (it is in opal/include/opal_stdatomic.h on master, but that file does not exist here on the v4.0.x branch). Signed-off-by: George Bosilca <[email protected]> (cherry picked from commit a16cf0e) Signed-off-by: Jeff Squyres <[email protected]>
…user If the user sets HCOLL_EXTERNAL_UCM_EVENTS=1 then we try to init the opal memory framework and register a mem release cb. Otherwise, rely on ucx. Signed-off-by: Valentin Petrov <[email protected]>
Signed-off-by: Nathan Hjelm <[email protected]> (cherry picked from commit 3e1dd36)
- added MPI based implementation of shmem_collect call Signed-off-by: Sergey Oblomov <[email protected]> (cherry picked from commit 7d8cb75)
- in some cases realloc operation may be completed without allocation of new buffer (and without additional data copy)
- added logic to reallocate buffer inplace if possible
Signed-off-by: Sergey Oblomov <[email protected]> (cherry picked from commit 277c2a9)
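As an illustration of reallocating in place, here is a minimal sketch over a hypothetical bump-style heap. The `heap_t` type and `try_realloc_inplace` function are assumptions made for this example; the allocator touched by the commit may be organized quite differently.

```c
#include <stddef.h>

/* Hypothetical bump-style heap: blocks are carved sequentially from one buffer. */
typedef struct {
    size_t capacity;  /* total bytes in the heap */
    size_t used;      /* bytes handed out so far */
} heap_t;

/* Try to resize the block [offset, offset + old_size) without moving it.
 * Assumes offset + old_size <= h->used <= h->capacity.
 * Returns 1 when the resize was done in place (no new buffer, no copy),
 * 0 when the caller has to allocate a new buffer and copy the data. */
static int try_realloc_inplace(heap_t *h, size_t offset,
                               size_t old_size, size_t new_size)
{
    int is_last_block = (offset + old_size == h->used);

    if (new_size <= old_size) {
        /* Shrinking always fits; give the tail back only for the last block. */
        if (is_last_block) {
            h->used = offset + new_size;
        }
        return 1;
    }
    /* Growing in place is only possible for the last block, if room remains. */
    if (is_last_block && new_size <= h->capacity - offset) {
        h->used = offset + new_size;
        return 1;
    }
    return 0;
}
```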
Signed-off-by: Sergey Oblomov <[email protected]> (cherry picked from commit a51badd)
Signed-off-by: Sergey Oblomov <[email protected]> (cherry picked from commit d6a0912)
- added synchronized flush operation on quiet call.
- flush is implemented using get operation
Signed-off-by: Sergey Oblomov <[email protected]> (cherry picked from commit 0b10841)
Example Usage: (With threading safety)
This is mostly based off recent UCX additions to their patcher: openucx/ucx#2703
They added triggers for
* mmap when (flags & MAP_FIXED) && (addr != NULL)
* shmat when (shmflg & SHM_REMAP) && (shmaddr != NULL)
Beyond that I noticed they already had a trigger for
* madvise when (advice == MADV_FREE)
that we didn't so I added that.
And the other main thing is we didn't really have shmat/shmdt active for some systems because we only had a path for syscall(SYS_shmdt, ) but we needed to also have a path for syscall(SYS_ipc, IPCOP_shmdt, ) and same for shmat.
Signed-off-by: Mark Allen <[email protected]> (cherry picked from commit eb88811)
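The three trigger conditions listed in this commit message can be written as small predicates. This is only a sketch: the function names are hypothetical, and the real patcher intercepts the calls and invokes its release hooks rather than returning a flag. `SHM_REMAP` and `MADV_FREE` are not available on every platform, so the sketch guards them and reports "no trigger" where they are missing.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/shm.h>

/* Hypothetical helper names; each returns nonzero when the intercepted call
 * may replace or discard an existing mapping, so cached registrations for
 * that range should be released first. */

static int mmap_needs_release(const void *addr, int flags)
{
    return (flags & MAP_FIXED) && (addr != NULL);
}

static int shmat_needs_release(const void *shmaddr, int shmflg)
{
#ifdef SHM_REMAP
    return (shmflg & SHM_REMAP) && (shmaddr != NULL);
#else
    (void)shmaddr;
    (void)shmflg;
    return 0;   /* SHM_REMAP is Linux-specific */
#endif
}

static int madvise_needs_release(int advice)
{
#ifdef MADV_FREE
    return advice == MADV_FREE;
#else
    (void)advice;
    return 0;   /* MADV_FREE not exposed by this platform's headers */
#endif
}
```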
naughtont3 force-pushed the tjn-tau-meminstr-v4-0-x branch from d5bb27d to 82930b4 (May 30, 2019 20:15)
Refreshed for latest upstream v4.0.x branch (4a7f6a4)
Signed-off-by: Yong Qin <[email protected]>
…et-v4.0 SPML/UCX: added synchronized flush on quiet - v4.0
Use the PVAR ctx to save the SPC index, so that no lookup nor restriction on the SPC vars position is imposed. Make sure the PVAR are always registered. Signed-off-by: George Bosilca <[email protected]>
Signed-off-by: George Bosilca <[email protected]>
Signed-off-by: George Bosilca <[email protected]>
…rom_6668 btl/uct: check for support before disabling UCX memory hooks
shmat/shmdt additions for patcher
V4.0.x Coll/hcoll: don't init opal memhooks unless explicitly requested
…warning btl/openib: delay UCX warning to add_procs()
Remove the debruijn component as it changes the daemon's parent process ID, thus breaking the other routed components Signed-off-by: Ralph Castain <[email protected]>
Signed-off-by: Ralph Castain <[email protected]>
…with [u]int32_t and [u]int64_t Signed-off-by: Scott Miller <[email protected]> (cherry picked from commit ca59cab)
…collect-v4.0 SSHMEM/COLL: added sshmem/mpi implementation for shmem_collect call - v4.0
use strtol() instead of atoi() in order to handle hostnames containing a large number. This is a one-off commit for the release branches since the regx framework has already been removed from master. Refs. open-mpi#6729 Signed-off-by: perrynzhou <[email protected]>
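A short sketch of why `strtol()` is the safer choice here: it returns a `long`, reports where parsing stopped, and signals overflow through `errno`, whereas `atoi()` returns an `int` and has undefined behavior when the value cannot be represented. The helper name `parse_node_number` is hypothetical and is not taken from the regx code.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal number at the start of `s` into *out.
 * Returns 0 on success, -1 if no digits were found or the value
 * does not fit in a long. */
static int parse_node_number(const char *s, long *out)
{
    char *end = NULL;
    long value;

    errno = 0;
    value = strtol(s, &end, 10);
    if (end == s) {
        return -1;              /* nothing was parsed */
    }
    if (errno == ERANGE) {
        return -1;              /* out of range for long */
    }
    *out = value;
    return 0;
}
```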
Updating NEWS for v4.0.2
Refs openpmix/openpmix#1413 Signed-off-by: Ralph Castain <[email protected]>
open-mpi/ompi@0fe756d introduced a bug in the coll/hcoll component. The ompi_requests allocated by libhcoll would be treated as coll_base_nbc_request during the ompi_coll_base_retain_<> call. Afterwards this would lead to a segv in the request cleanup. Fix: since the libhcoll interface does not distinguish between blocking and non-blocking requests, use coll_base_nbc_request all the time and initialize it properly in coll/hcoll/get_coll_handle(). It is still within 2 cache lines. Signed-off-by: Valentin Petrov <[email protected]>
These variables were renamed in 904276b; update them to use the new names. Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit 2ab8109)
…request_bugfix V4.0.x Coll/hcoll: fixes hcoll non-blocking colls support
Signed-off-by: Joseph Schuchart <[email protected]> (cherry picked from commit 08cb638)
…able-names-in-make-check v4.0.x: Update OPAL DDT variable names
UCX osc: properly release exclusive lock to avoid lockup (v4.0.x)
* Fix open-mpi#6618
  - See comments on Issue open-mpi#6618 for finer details.
* The `plm/rsh` component uses the highest priority `routed` component to construct the launch tree. The remote orteds will activate all available `routed` components when updating routes. This allows the parent vpid on the remote `orted` to differ from the one expected in the tree launch. The result is that the remote orted tries to contact its parent with the wrong contact information and orted wireup will fail.
* This fix forces the orteds to use the same `routed` component as the HNP used when constructing the tree, if tree launch is enabled.
Signed-off-by: Joshua Hursey <[email protected]>
This commit fixes issue open-mpi#6853 by removing MacOS/Darwin-specific logic from intercept_mmap. Signed-off-by: Harumi Kuno <[email protected]>
v4.0.x: regx/naive: add regx/naive component
v4.0.x: Fix mmap infinite recurse in memory patcher
Remove unnecessary error log
Signed-off-by: guserav <[email protected]> (cherry picked from commit 3c9f4e6)
…it-atomics v4.0.x: Fix osc sm posts when only 32 bit atomics support v4.0.x
This patch fixes the merge of contiguous elements into larger but more compact datatypes, and allows contiguous elements to have their blocklen increased instead of the count. The idea is to always maximize the blocklen, i.e. the contiguous part of the datatype. Signed-off-by: George Bosilca <[email protected]> (cherry picked from commit 41e6f55)
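The idea of growing the blocklen instead of the count can be sketched as follows. The `elem_desc_t` layout and `maximize_blocklen` function are hypothetical and much simpler than the real datatype engine; they only show the folding step under the assumption that repetitions sit back to back in memory.

```c
#include <stddef.h>

/* Hypothetical element description; NOT the real OPAL datatype descriptor. */
typedef struct {
    size_t    count;      /* number of repetitions of the block */
    size_t    blocklen;   /* basic elements per block */
    size_t    elem_size;  /* bytes per basic element */
    ptrdiff_t extent;     /* byte stride between consecutive blocks */
} elem_desc_t;

/* If consecutive blocks touch (stride equals block size in bytes), fold the
 * repetitions into a single, longer block. The total number of bytes
 * described, count * blocklen * elem_size, is unchanged. */
static void maximize_blocklen(elem_desc_t *e)
{
    if (e->count > 1 &&
        e->extent == (ptrdiff_t)(e->blocklen * e->elem_size)) {
        e->blocklen *= e->count;
        e->extent   *= (ptrdiff_t)e->count;
        e->count     = 1;
    }
}
```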
elements that can be merged into a larger UINT1 type. Signed-off-by: George Bosilca <[email protected]> (cherry picked from commit 82d6322)
If both types of interfaces are enabled, don't error out if one of them isn't able to open listener sockets. Only one interface family may be available on some machines, but someone might want to build the code to run more generally. Refs openpmix/prrte#249 Signed-off-by: Ralph Castain <[email protected]> (cherry picked from commit 06d188e)
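The relaxed requirement can be sketched as a small status-combining helper: startup fails only when no enabled interface family managed to open a listener. The names and return codes below are hypothetical and do not come from the actual listener code.

```c
/* Hypothetical status codes for this sketch. */
enum { LISTEN_OK = 0, LISTEN_ERR = -1 };

/* Combine the per-family results. `v4_rc` and `v6_rc` are the return codes
 * from attempting to open IPv4 and IPv6 listener sockets; a family that was
 * not built in is marked as not enabled and its return code is ignored.
 * Succeeds if at least one enabled family opened its listener. */
static int combine_listener_status(int v4_enabled, int v4_rc,
                                   int v6_enabled, int v6_rc)
{
    int opened = 0;

    if (v4_enabled && v4_rc == LISTEN_OK) {
        opened++;
    }
    if (v6_enabled && v6_rc == LISTEN_OK) {
        opened++;
    }
    return (opened > 0) ? LISTEN_OK : LISTEN_ERR;
}
```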
v4.0.x: Be a little less restrictive on interface requirements
v4.0.x: Datatype optimization and fix
Fix tree spawn routed component issue
Revving VERSION to v4.0.2rc2 Signed-off-by: Geoffrey Paulsen <[email protected]>
Revving VERSION to v4.0.2rc2
There was a restructuring of the comm/win/group objects that shifted the fields down for getting at the class name. This appears to relate to opal_infosubscriber_t changes from 50aa143. NOTE: The opal_infosubscriber_t changes were not applied to the oshmem_group_t structure, so no need for changes there.
Guard these using existing OPAL_ENABLE_MEM_PROFILE option that is enabled via '--enable-mem-profile' configury.
naughtont3 force-pushed the tjn-tau-meminstr-v4-0-x branch from e9245a4 to 2ac51fe (September 17, 2019 17:28)
NOTE - I should probably create a new PR at this point, but the branch is still the correct one for this item.
See https://github.com/ParaToolsInc/ompi.git